A Boundedness Theoretical Analysis for GrADP Design: A Case Study on Maze Navigation
Authors
Zhen Ni, Xiangnan Zhong, and Haibo He
Department of Electrical, Computer, and Biomedical Engineering, University of Rhode Island, Kingston, RI, USA 02881
Email: {ni,xzhong,he}@ele.uri.edu
Conference Name: Proc. Int. Joint Conf. Neural Networks (IJCNN'15)
Conference Date: July 13, 2015
Abstract
This paper presents a new theoretical analysis of the goal representation adaptive dynamic programming (GrADP) design proposed in [1], [2]. Unlike the convergence proofs for adaptive dynamic programming (ADP) in the literature, we provide new insight into the error bound between the estimated value function and the expected value function. We then employ the critic network in the GrADP approach to approximate the Q-value function and use the action network to provide the control policy. The goal network provides the internal reinforcement signal for the critic network over time. Finally, we illustrate on the maze navigation example that the estimated Q-value function stays within an arbitrarily small bound of the expected value function.
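The notion of an error bound between an estimated and an expected Q-value function can be illustrated with a minimal sketch. It is not the GrADP design itself: it swaps the critic, action, and goal networks for a plain tabular Q-value iteration, and the maze layout, the step cost of -1, the discount factor `GAMMA`, and the learning rate `ALPHA` are all assumptions made for illustration only.

```python
# Hypothetical 4x4 maze: 'S' start, 'G' goal, '#' wall, '.' free cell.
MAZE = ["S..#",
        ".#..",
        "...#",
        "#..G"]
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right
GAMMA = 0.9  # discount factor (assumed)
ALPHA = 0.5  # learning rate (assumed)


def free_cells():
    """All non-wall cells of the maze as (row, col) pairs."""
    return [(r, c) for r, row in enumerate(MAZE)
            for c, ch in enumerate(row) if ch != "#"]


def step(state, move):
    """Deterministic move; hitting a wall or the border leaves the agent in place."""
    if MAZE[state[0]][state[1]] == "G":
        return state, 0.0  # the goal is absorbing and costs nothing
    r, c = state[0] + move[0], state[1] + move[1]
    if not (0 <= r < len(MAZE) and 0 <= c < len(MAZE[0])) or MAZE[r][c] == "#":
        r, c = state
    return (r, c), -1.0  # every move costs -1


def backup(q, s, a):
    """One-step Bellman optimality target for the pair (s, a)."""
    nxt, reward = step(s, ACTIONS[a])
    return reward + GAMMA * max(q[nxt, b] for b in range(len(ACTIONS)))


def sweep(q, alpha):
    """Synchronous update of every (state, action) pair from a frozen copy of q."""
    return {(s, a): (1 - alpha) * q[s, a] + alpha * backup(q, s, a)
            for (s, a) in q}


def zero_q():
    return {(s, a): 0.0 for s in free_cells() for a in range(len(ACTIONS))}


def sup_error(q, q_ref):
    """Largest absolute gap between two Q tables."""
    return max(abs(q[k] - q_ref[k]) for k in q)


# Reference ("expected") values: alpha = 1 turns the sweep into value iteration.
q_star = zero_q()
for _ in range(500):
    q_star = sweep(q_star, 1.0)

# Estimated values: damped updates, recording the error before each sweep.
q_est, errors = zero_q(), []
for _ in range(400):
    errors.append(sup_error(q_est, q_star))
    q_est = sweep(q_est, ALPHA)
errors.append(sup_error(q_est, q_star))
```

In this simplified setting the recorded `errors` shrink at least geometrically with factor `1 - ALPHA + ALPHA * GAMMA`, so the gap can be made as small as desired by running more sweeps. This property belongs to the tabular sketch only and says nothing about the bound derived in the paper.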
Similar resources
Melatonin improves spatial navigation memory in male diabetic rats
The aim of the present study was to evaluate the effect of melatonin as an antioxidant on spatial navigation memory in male diabetic rats. Thirty-two male white Wistar rats weighing 200 ± 20 g were randomly divided into four groups: control, melatonin, diabetic, and melatonin-treated diabetic. Experimental diabetes was induced by intraperitoneal injection of 50 mg kg⁻¹ streptozotocin. Melatonin...
Vitamin D Deficiency Impairs Spatial Learning in Adult Rats
Background: Through its membrane and intracellular receptors, vitamin D regulates many vital functions in the body, including its well-known actions on the musculoskeletal system. A growing body of evidence demonstrates that vitamin D is involved in some behavioral aspects of neurocognition. The present study was designed to evaluate the effect of food regimens without vitamin D or with a supplement of ...
Practical Evaluation of EKF and UKF Filters for Terrain Aided Navigation
This article studies batch and recursive methods used in terrain navigation systems. Terrain navigation has many disadvantages, so researchers have studied different methods of aided navigation for many years. As a result, more types of aided navigation systems have been introduced, each with practical and theoretical advantages and disadvantages. One of the main ideas for ...
Information Architecture of Research Institutes’ Website, Case Study: Iranian Research Institute for Information Science and Technology’s Website
Purpose: As mission-oriented organizations, research institutes have the task of answering community questions in specialized areas, and should therefore be able to effectively present their outputs to their target users. Achieving such a goal requires the proper use of information architecture principles to properly organize the information platform in which the research institutes interact wi...
Navigation in Determining the Physical Factors Affecting Creativity of Children's in Urban Parks
Despite the availability of extensive facilities for children, the effect of the environment on children's creativity is often ignored. Children can attend playgrounds in city parks independently from age 6, so they are exposed to the influence of the environment during this period. It is necessary to design playgrounds for children to improve their creativity. The o...